GH-73435: Implement recursive wildcards in pathlib.PurePath.match()#101398
Merged
barneygale merged 40 commits intopython:mainfrom May 30, 2023
Merged
GH-73435: Implement recursive wildcards in pathlib.PurePath.match()#101398barneygale merged 40 commits intopython:mainfrom
pathlib.PurePath.match()#101398barneygale merged 40 commits intopython:mainfrom
Conversation
…ch() Add a new *recursive* argument to `pathlib.PurePath.match()`, defaulting to `False`. If set to true, `match()` handles the `**` wildcard as in `Path.glob()`, i.e. it matches any number of path segments. We now compile a `re.Pattern` object for the entire pattern. This is made more difficult by `fnmatch` not treating directory separators as special when evaluating wildcards (`*`, `?`, etc), and so we arrange the path parts onto separate *lines* in a string, and ensure we don't set `re.DOTALL`.
Contributor
Author
|
|
This was referenced Mar 12, 2023
pathlib.PurePath.match()
zooba
reviewed
May 30, 2023
Member
zooba
left a comment
There was a problem hiding this comment.
Approved, but consider adding a couple of comments (as suggested) so that the next person who has to trace through this code is grateful rather than mad at you ;-)
Member
|
Perfect! Ship it |
AlexWaygood
reviewed
May 30, 2023
Co-authored-by: Alex Waygood <[email protected]>
Contributor
Author
|
Thank you for your help Alex, Hugo and Steve! |
This was referenced Jul 2, 2023
Contributor
Author
|
Hey, if it interests anyone, I have a follow-up PR that simplifies a bunch of the code added in this PR. It does this by adding a new seps parameter to |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
PurePath.match()now handles the**wildcard as inPath.glob(), i.e. it matches any number of path segments.We now compile a
re.Patternobject for the entire pattern. This is made more difficult byfnmatchnot treating directory separators as special when evaluating wildcards (*,?, etc), and so we arrange the path parts onto separate lines in a string, and ensure we don't setre.DOTALL.This improves performance of
match()around 2x-3x times for simple patterns, and more for complex patterns:$ ./python -m timeit \ -s 'from pathlib import PureWindowsPath as P; path = P("C:/foo/bar.py"); pattern = P("c:/*/*.py")' \ 'path.match(pattern)' 50000 loops, best of 5: 8.13 usec per loop # before 1000000 loops, best of 5: 297 nsec per loop # after